Abstract

Given a binary string, can one find a short description for it? The well-known
answer is "no, Kolmogorov complexity is not computable." Faced with
this barrier, one might instead seek a short list of candidates which includes 
a laconic description. Remarkably we find that efficiently computable short 
lists do exist, and we discuss the extent to which one can obtain them. Along 
the way, we employ expander graphs and randomness dispersers to obtain an 
explicit online matching theorem for bipartite graphs. 

1 The quest for short descriptions 

We explore an interaction between randomness extraction, combinatorics, and
Kolmogorov complexity culminating in Corollary 9. The following classical result,
stated in the language of Definition 1, dates back to 1935.

Hall's theorem ([4]). Let $G = (L, R, E)$ be a bipartite graph. Then $G$ is an
$(s, 1)$-expander if and only if for any set $S \subseteq L$ of size at most $s$,
there exists a bijection between $S$ and some subset of $E(S)$ whose graph consists
of pairs from $E$.
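As a small illustration of the equivalence in Hall's theorem, the following Python sketch checks the expansion condition by brute force and, separately, searches for a matching with the standard augmenting-path method. The function names and the toy graph are assumptions made for this example and do not come from [4]; the brute-force check takes exponential time and is only meant for tiny graphs.

```python
from itertools import combinations

def neighbors(edges, subset):
    """E(S): the right-hand vertices adjacent to some vertex of S."""
    return {y for x in subset for y in edges.get(x, ())}

def is_s1_expander(edges, s):
    """Brute-force check that every S with |S| <= s has |E(S)| >= |S|."""
    left = list(edges)
    return all(
        len(neighbors(edges, subset)) >= size
        for size in range(1, s + 1)
        for subset in combinations(left, size)
    )

def find_matching(edges, subset):
    """Augmenting-path search; returns an injective {x: y} or None."""
    owner = {}  # right vertex -> left vertex currently matched to it

    def augment(x, seen):
        for y in edges.get(x, ()):
            if y in seen:
                continue
            seen.add(y)
            # Take y if it is free, or if its current owner can move elsewhere.
            if y not in owner or augment(owner[y], seen):
                owner[y] = x
                return True
        return False

    for x in subset:
        if not augment(x, set()):
            return None
    return {x: y for y, x in owner.items()}

# Hypothetical toy graph: left vertices "a", "b", "c"; right vertices 0, 1, 2.
toy = {"a": [0, 1], "b": [0], "c": [1, 2]}
print(is_s1_expander(toy, 3), find_matching(toy, ["a", "b", "c"]))
```

On this toy graph both tests succeed, and on a graph where two left vertices share a single neighbor both fail, as the theorem predicts.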

We shall prove an explicit version of Hall's theorem in the spirit of Musatov,
Romashchenko, and Shen's recent "online matchings" [6] (see Definition 8).

We now digress to the primary topic of this manuscript. The Kolmogorov
complexity of a binary string is the length of the shortest input that computes it
via some standard machine [5]. As much as one might like to know the Kolmogorov
complexity of a given string, it is impossible to obtain this quantity effectively [5].
Even estimating Kolmogorov complexity for a given string is infeasible.¹ Moreover,
any algorithm mapping a string of length $n$ to a list of values containing its
Kolmogorov complexity must, for infinitely many $n$, include in each list at least a
fixed fraction of the lengths below $n + O(1)$ [2].

¹ Indeed, suppose that some computable function $f$ estimated Kolmogorov complexity for
strings within a multiplicative factor $d$. That is, for every binary string $x$,

$$f(x)/d \le C(x) \le d \cdot f(x),$$

Remarkably, the situation differs when we seek a short list of candidate
descriptions for a given string. We will show that it is possible to efficiently
compute a polynomial length list containing a shortest description for any given
string, up to an additive constant number of bits. Bauwens, Makhlin, Vereshchagin,
and Zimand [1] previously showed that one can efficiently compute, for any string
$x$, a polynomial length list of elements one of which is a description for $x$
whose length does not exceed a shortest description plus $O(\log|x|)$ bits. The
existence of our listing algorithm will follow from our explicit version of Hall's
theorem, and we devote the remainder of our discussion to establishing this crucial
result.

2 Randomness extraction tools 

Our construction will combine the expander graph of Guruswami, Umans, and
Vadhan [3] with the disperser graph of Ta-Shma, Umans, and Zuckerman [7]. A
graph is called explicit if the $i$th neighbor of each vertex can be computed in time
polynomial in the size of the graph. Triplets $(L, R, E)$ denote bipartite graphs
in which $L$ is a set of left-hand vertices, $R$ is a set of right-hand vertices, and
$E \subseteq L \times R$ is a set of edges connecting these two halves. A bipartite
graph has left degree $d$ if each of its left-hand vertices has exactly $d$ neighbors
in $R$. Finally, $E(S)$ denotes $S$'s neighbors in $R$, and $|X|$ denotes the
cardinality of a set $X$.

Definition 1. A bipartite graph $(L, R, E)$ is called a $(K, \beta)$-expander if for
every set $S \subseteq L$ of size at most $K$, $|E(S)| \ge \beta|S|$.
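To make Definition 1 concrete, here is a minimal Python sketch that computes, by exhaustive search, the largest $\beta$ for which a small graph is a $(K, \beta)$-expander. The function name `expansion_factor` and the example graph are hypothetical, and the search is exponential in $K$, so this is only a way to experiment with the definition.

```python
from fractions import Fraction
from itertools import combinations

def expansion_factor(edges, K):
    """Largest beta such that the graph is a (K, beta)-expander.

    `edges` maps each left vertex to the collection of its right neighbors.
    """
    left = list(edges)
    best = None
    for size in range(1, min(K, len(left)) + 1):
        for subset in combinations(left, size):
            image = set().union(*(edges[x] for x in subset))  # E(S)
            ratio = Fraction(len(image), size)                # |E(S)| / |S|
            if best is None or ratio < best:
                best = ratio
    return best

# Hypothetical example: three left vertices, each adjacent to two of three right vertices.
triangle = {0: {0, 1}, 1: {1, 2}, 2: {0, 2}}
print(expansion_factor(triangle, 1), expansion_factor(triangle, 2), expansion_factor(triangle, 3))
```

The factor can only decrease as $K$ grows, since larger $K$ quantifies over more sets.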

Theorem 2 (Guruswami, Umans, and Vadhan [3]). There exists a constant $c$
such that for every $\alpha, \epsilon > 0$ and every $0 \le k \le n$, there is an
explicit bipartite graph $(L, R, E)$ with

• $|L| = 2^n$,

• left degree $d = c \cdot (nk/\epsilon)^{1+1/\alpha}$,

• $|R| \le d^2 \cdot 2^{k(1+\alpha)}$,

which is a $[2^k, (1 - \epsilon)d]$-expander.

(Footnote continued) where $C(x)$ denotes the Kolmogorov complexity of $x$. There is a
string $x$ of each length satisfying $C(x) \ge |x|$ [5], and for all such $x$, we have
$f(x) \ge C(x)/d \ge |x|/d$. Let $g(n)$ be a computable function which gives the first
string found at each length satisfying $f(x) \ge |x|/d$. Then
$C[g(n)] \ge f[g(n)]/d \ge |g(n)|/d^2$ for all $n$. But this can only be true for
finitely many $n$ because $g(n)$ is describable in $\log n + O(1)$ bits, a contradiction.

Definition 3. A bipartite graph $(L, R, E)$ is called a $(K, \epsilon)$-disperser if
every subset $S \subseteq L$ of cardinality at least $K$ has at least
$(1 - \epsilon)|R|$ distinct neighbors in $R$.

Theorem 4 (Ta-Shma, Umans, Zuckerman [7]). For every $K > 0$, $\epsilon > 0$ and any
set of vertices $L$, there exists an explicit $(K, \epsilon)$-disperser $(L, R, E)$
with left degree $d = \mathrm{poly}(\log|L|)$ and $|R| = \Omega(Kd/\log^3|L|)$.

We can use Theorem 4 to obtain a weakly expanding graph with the following
handy parameters.

Lemma 5. For all but finitely many $k$, and for any set $L$ with $|L| \ge 2^k$, there
is an explicit bipartite graph $(L, R, E)$ whose

• left degree is polynomial in $\log|L|$,

• $|R| = 2^{k+2}$,

and such that any subset of $L$ of size at least $2^k$ has at least $2^k$ neighbors in $R$.

Proof. Apply Theorem 4 with $K = 2^k$ to obtain a $(2^k, 1/2)$-disperser graph
$(L, R', E')$ with $L$ being any set of size at least $2^k$, left degree
$\log^4|L| \cdot \mathrm{poly}(\log|L|)$, and

$$|R'| = \Omega\left[\frac{2^k \cdot \log^4|L| \cdot \mathrm{poly}(\log|L|)}{\log^3|L|}\right] = \Omega[2^k \cdot k\,\mathrm{poly}(k)].$$

For all sufficiently large $k$, $|R'| \ge 2^{k+2}$. Partition $R'$ into $2^{k+2}$
equivalence classes each of size at most $t = \lceil |R'|/2^{k+2} \rceil$, and let $R$
be a set consisting of one member from each of these equivalence classes. We then
"merge" the edges $E'$ so that $R$ inherits $R'$'s edges as follows:

$$E = \{(x, y) : x \in L,\ y \in R, \text{ and } x \text{ is the neighbor of some vertex in } y\text{'s equivalence class}\}.$$

The left degree of $(L, R, E)$ is

$$[\text{left degree of } (L, R', E')] \cdot t = \mathrm{poly}(\log|L|).$$

Since $(L, R', E')$ is a disperser, any subset $S \subseteq L$ of size at least $2^k$
satisfies $|E'(S)| \ge |R'|/2$, whence

$$|E(S)| \ge \frac{|E'(S)|}{t} \ge \frac{|R'|/2}{|R'|/2^{k+1}} \ge 2^k. \qquad \square$$
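The merging step in this proof can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: right-hand vertices are taken to be the integers $0, \dots, |R'| - 1$, equivalence classes are consecutive blocks of $t$ integers, and the name `merge_right_vertices` is hypothetical. It only demonstrates why merging classes of size at most $t$ shrinks a neighbor set by a factor of at most $t$.

```python
from math import ceil

def merge_right_vertices(edges, right_size, target_classes):
    """Collapse right vertices 0..right_size-1 into blocks of size at most t.

    Returns (merged_edges, t); block j holds the vertices j*t, ..., j*t + t - 1
    and is represented by the integer j.
    """
    t = ceil(right_size / target_classes)
    merged = {x: {y // t for y in ys} for x, ys in edges.items()}
    return merged, t

# Hypothetical toy instance: |R'| = 8 merged down to 4 classes, so t = 2.
edges = {"u": {0, 1, 5}, "v": {2, 7}}
merged, t = merge_right_vertices(edges, right_size=8, target_classes=4)
print(merged, t)

# Each class absorbs at most t original neighbors, so |E(S)| >= |E'(S)| / t.
for x in edges:
    assert len(merged[x]) * t >= len(edges[x])
```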

By employing the probabilistic method, one can obtain the following object,
which will be useful in our construction for small values of $k$. Previously,
Musatov, Romashchenko, and Shen [6] constructed a non-explicit bipartite graph
with weaker parameters.
Lemma 6 (Bauwens, Makhlin, Vereshchagin, Zimand [1]). For every $k \ge 0$, there
exists a (non-explicit) bipartite $(2^k, 1)$-expander graph $G = (L, R, E)$ such that

• $L$ consists of all binary strings of length at least $k$,

• the cardinality of $R$ is $2^{k+3}$, and

• the degree of each vertex $x \in L$ is $O(|x|)$.

Moreover, all the strings of length greater than $2^k$ have $2^k$ arbitrarily chosen
neighbors, so neighbors in $G$ can be computed efficiently given a finite amount of
information regarding strings of length less than $2^k$.

As discussed at the end of this paper, and as established in [1], one can use
Lemma 6 to show it is possible to compute from any string $x$ a list of length
$O(|x|^2)$ containing a description for $x$ whose length exceeds the shortest possible
description for $x$ by no more than a constant number of bits.

3 Explicit online matching 

We now present the main theorem. The key observation is the following lemma. 

Lemma 7. For all sufficiently large $k$, there exists an explicit bipartite graph
$(L, R, E)$ such that

• $L$ consists of all binary strings of length greater than $k$,

• the cardinality of $R$ is $2^{k+2}$,

• the degree of each vertex $x \in L$ is $\mathrm{poly}(|x|)$, and

which is a $(2^k, 1)$-expander.

Proof. In the following explicit construction, we effectively simulate the
probabilistic method in two phases. First, we use Guruswami, Umans, and Vadhan's
explicit, unbalanced expander graph construction to spread the neighbors of the
left-hand nodes "uniformly" across a small middle vertex set $M$ (Theorem 2).
Next we take this uniform spread of neighbors $M$ and map it uniformly to an
even smaller set of vertices, namely the right-hand vertices $R$, via a disperser
(Lemma 5). The edges $E$ of our desired graph will consist of those pairs in $L$ and
$R$ which are connected by the composition of these two mappings. In this way, we
simulate a "random" choice of edges between $L$ and $R$.

We now discuss how to handle strings of different lengths. We will treat the
strings of length greater than $2^k$ separately, and for each length $n$ in the
remaining range, we create an explicit bipartite expander between strings of length
$n$ and an intermediate set $M_n$. We then disperse the union of the $M_n$'s into the
set $R$, and the desired set of edges $E$ will consist of all pairs
$(x, y) \in L \times R$ which are joined via a path in the composition of these two
graphs.

As the final step of our construction of $(L, R, E)$, we will add $2^k$ neighbors
for each string of length more than $2^k$, giving them each polynomial degree as
well as the desired size-preserving property. We mention this up front in order
to reduce our problem to a construction for strings with length at most $2^k$. One
complication that arises is that we do not know how many of the $s$ input strings
will have size bounded by $2^k$, and so our disperser will "guess" this quantity by
having a separate set of edges for each possible (approximate) size.

Let $L$ be as in the assumption of this lemma. Let $k$ be sufficiently large so as
to satisfy the hypothesis of Lemma 5, let $\alpha > 0$, let $\epsilon = 1/2$, and let
$L_n$ denote the strings of length $n$. We generate an explicit expander graph
$G_n = (L_n, M_n, A_n)$ for each string length $n \le 2^k$ and each approximation
size. Apply Theorem 2 to obtain $G_n$ with left degree

$$d_n = O[(nk)^{1+1/\alpha}],$$

right vertex cardinality

$$|M_n| \le d_n^2 \cdot 2^{k(1+\alpha)},$$

and which is a $[2^k, (1/2)d_n]$-expander. Note that $d_n$ is both $\mathrm{poly}(n)$
and bounded by $\mathrm{poly}(2^k)$. Using the fact $d_n$ increases with $n$, for each
$n$ we embed all of the $M_n$'s into a set $M$ of cardinality $|M_{2^k}|$. Let $A$
denote the corresponding edges of this embedding, that is,

$$A = \{(x, y) : x \in L,\ y \in M, \text{ and } (x, y) \in A_n \text{ for some } n\}.$$

Now $(L, M, A)$ is a bipartite graph where the left degree of each length $n$ string is
$d_n$, and

$$|M| = \sum_{n=0}^{2^k} |M_n| \le (2^k + 1) \cdot d_{2^k}^2 \cdot 2^{k(1+\alpha)} \le O[(2^k + 1) \cdot (2^k \cdot k)^{2(1+1/\alpha)} \cdot 2^{k(1+\alpha)}],$$

hence $\log|M| = O(k)$.

We now connect $M$ to $R$ via Lemma 5. For any size $2^t$ subset $T \subseteq L$ with
$t < k$, we have $|A(T)| \ge 2^{t-1}$ by the expansion property of $(L, M, A)$. By
Lemma 5 there exist, for all integers $0 \le t < k$, explicit bipartite graphs
$(M, R_t, B_t)$ whose left degree, $\mathrm{poly}(\log|M|)$, is polynomial in $k$,
whose right vertex set satisfies $|R_t| = 2^{t+2}$, and such that any subset of $M$ of
size at least $2^t$ has at least $2^t$ neighbors in $R_t$. We embed all of the $R_t$'s
into a set $R$ of cardinality $2^{k+1}$ which inherits all the corresponding edges:

$$B = \{(x, y) : x \in M,\ y \in R, \text{ and } (x, y) \in B_t \text{ for some } t < k\}.$$
Finally, $E$ consists of all edges between $L$ and $R$ which are connected through $M$
by composition of the edge sets $A$ and $B$. That is,

$$E = \{(x, y) : x \in L,\ y \in R, \text{ and } \exists z \in M \text{ with } (x, z) \in A \text{ and } (z, y) \in B\}.$$

This concludes the construction of the bipartite graph $(L, R, E)$, whose left degree
for each string of length $n$ is $\max\{O(n),\ d_n \cdot \mathrm{poly}(\log|M|)\} = \mathrm{poly}(n)$.
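The composition of the two edge sets can be sketched as follows. This is only an illustration of the set-builder definition of $E$: the adjacency-dictionary representation, the name `compose`, and the toy vertex labels are assumptions made for the example, and the toy graphs are not the expander or disperser of the construction.

```python
def compose(A, B):
    """Edges (x, y) for which some middle vertex z has (x, z) in A and (z, y) in B.

    A maps left vertices to sets of middle vertices;
    B maps middle vertices to sets of right vertices.
    """
    return {x: set().union(*(B.get(z, set()) for z in zs)) for x, zs in A.items()}

# Hypothetical toy instance with middle vertices "m0" and "m1".
A = {"00": {"m0", "m1"}, "01": {"m1"}}
B = {"m0": {0}, "m1": {1, 2}}
E = compose(A, B)
print(E)

# The degree of x in E is at most the sum of the B-degrees of its middle neighbors,
# hence at most (degree in A) * (maximum degree in B).
for x, zs in A.items():
    assert len(E[x]) <= sum(len(B[z]) for z in zs)
```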

Let $S$ be a subset of $L$ of size $s \le 2^k$. We wish to show that $|E(S)| \ge s$,
but for the moment let us settle for half of this: $|E(S)| \ge |S|/2$. Let $Q$ be the
strings in $S$ of length greater than $2^k$, let $S' = S \setminus Q$, and let $t_0$ be
the greatest integer less than $k$ satisfying $2^{t_0} \le |S'|$. Then

$$|E(S)| = |E(Q)| + |E(S')| \ge |Q| + 2^{t_0} \ge \frac{|Q|}{2} + \frac{|S'|}{2} = \frac{|S|}{2}.$$

We have thus proved the intermediate statement: 

For all sufficiently large $k$, there exists an explicit bipartite graph $(L, R, E)$
such that $L$ consists of all binary strings, the cardinality of $R$ is at most
$2^{k+1}$, the degree of each vertex $x \in L$ is $\mathrm{poly}(|x|)$, and such that
any size $s \le 2^k$ subset of $L$ has at least $s/2$ neighbors in $R$.

It remains to boost the number of neighbors from $s/2$ to $s$. For each $i \le k$,
apply the statement above to achieve a bipartite graph $G_i = (L, R'_i, E'_i)$ with
$|R'_i| = 2^{i+1}$ and with degree of each length $n$ string in $L$ being
$\mathrm{poly}(n)$ which is a $(2^i, 1/2)$-expander. We construct the $R'_i$'s so as to
be pairwise disjoint. Let $R' = \bigcup_i R'_i$ and $E' = \bigcup_i E'_i$. We claim that
$G = (L, R', E')$ satisfies the conclusion of the lemma. The left degree of $G$ is no
more than the sum of the left degrees for the $G_i$'s, and since there were only
$k + 1$ of the $G_i$'s, the degree of each string in $L$ is polynomial in its length.
Furthermore,

$$|R'| = \sum_{i=0}^{k} |R'_i| \le 2^{k+2} - 1.$$

Finally, let S be a subset of L of size at most k. By construction, > 15*1/2*, 

and therefore 

\E'(S)\ = £ |^(5)| >|S|-£i = \S\ -1. 

i=0 i=0 

In order to increment the size of $E'(S)$ by one, add one new vertex to $R'$ and add
an edge to it from each vertex in $L$. Call the resulting set of right-hand vertices
$R''$ and the resulting set of edges $E''$. Then $(L, R'', E'')$ satisfies the
conclusion of the lemma. □
The next definition and the argument in the next proof are due to Musatov,
Romashchenko, and Shen [6], but we include them here for clarity and completeness.

Definition 8. We say that a bipartite graph $(L, R, E)$ admits online matchings up
to size $s$ if there exists an algorithm such that for any set of vertices in $L$ of
size at most $s$, whose vertices are (adversarially) presented to the algorithm one at
a time, the algorithm can assign each vertex in order received to one of its neighbors
(without knowing what comes next), and the overall assignment after all $\le s$
elements is a bijection. An online matching is efficient if a neighbor can be selected
in time linear in the degree of the input.

We will use the following result to derive a result about Kolmogorov complexity. 

Explicit Online Matching Theorem. For every $k \ge 0$, there exists an explicit
bipartite graph $G = (L, R, E)$ such that

• $L$ consists of all binary strings of length at least $k$,

• the cardinality of $R$ is at most $2^{k+4}$,

• the degree of each vertex $x \in L$ is $\mathrm{poly}(|x|)$, and

$G$ admits efficient online matching up to size $2^k$.

Proof. Let $L$ be as in the hypothesis, and let $k \ge 0$. We build our construction in
one of two ways, depending on whether $k$ is large or small. For $k$ sufficiently large,
apply Lemma 7 to obtain, for each integer $i \le k$, an "offline" matching: a bipartite
graph $(L, R_i, E_i)$ where $R_i$ is a set of vertices with cardinality $2^{i+2}$, the
left degree is polynomial in the string length, and any subset of $L$ of size
$s \le 2^i$ has at least $s$ neighbors. For the finitely many $k$ which do not meet the
size requirements of Lemma 7, we use Lemma 6 to hardwire similar offline matchings for
$i \le k$ but with $|R_i| = 2^{i+3}$.

As an emergency fallback, we make an extra bipartite graph $(L, R_{-1}, E_{-1})$
with $R_{-1}$ being a singleton set with every vertex in $L$ connected to it, and
$E_{-1}$ reflects this relation. We build the $R_i$'s pairwise disjoint. Let
$R = \bigcup_{i \ge -1} R_i$ and $E = \bigcup_{i \ge -1} E_i$. We claim that the
explicit bipartite graph $(L, R, E)$ has the required properties to establish the
theorem. Since the degree of each vertex $x \in L$ is the sum of the degrees for $x$
over all $E_i$'s, we see that the degree of each length $n$ string is at most
$(k + 1) \cdot \mathrm{poly}(n)$, which is still polynomial in $n$. Furthermore,

$$|R| = \sum_{i=-1}^{k} |R_i| \le 2^{k+4}.$$

It remains to verify the efficient online matching property. We apply a greedy
algorithm. Originally, all vertices in $R$ are marked as unused. When a vertex
$x \in L$ comes in, we assign it to an arbitrary unused neighbor in $R_k$, if such a
neighbor exists. If not, we attempt to assign $x$ to an unused neighbor in $R_{k-1}$.
If this is not possible, we try for an unused neighbor in $R_{k-2}$, etc. When (and if)
$x$ gets assigned to a particular $y \in R_i$, we mark $y$ as used and wait for the next
vertex in $L$ to arrive. Successive arrivals are handled similarly.

We claim that every $x$ gets assigned through this method, and since each $x$
is assigned to an unused vertex, the resulting matching will be a bijection. We
argue that no more than $2^{i-1}$ vertices in $L$ may fail assignment at level $R_i$
for any $i \ge 0$. Suppose this bound were exceeded, and let $X \subseteq L$ denote
those vertices which failed assignment at level $R_i$. Then $|X| > 2^{i-1}$, and so by
construction $|E_i(X)| > 2^{i-1}$. But now less than $2^{i-1}$ vertices in $R_i$ were
ever used, so some element in $E_i(X)$ must not have been used. This is impossible
because only those vertices in $L$ which had no unused neighbors can fail assignment,
yet some vertex in $X$ has an unused neighbor, contrary to the definition of $X$.

Thus at most $2^{k-1}$ vertices fail assignment at level $R_k$, and of these at most
half fail assignment at level $R_{k-1}$, and so by induction, at most $2^0$ vertices
have failed assignment at all levels $R_i$ for $i \ge 0$. The remaining vertex in
$R_{-1}$ is used to assist with the recalcitrant vertex, if needed. Hence all vertices
are matched. □
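The greedy procedure from this proof can be sketched in Python as follows. The class name, the `from_table` helper, and the tiny neighbor tables are hypothetical; the tables are hand-picked to exercise the fallback between levels and are not the expanders $(L, R_i, E_i)$ of the construction, so the guarantee that every vertex is matched is not claimed for them.

```python
class GreedyOnlineMatcher:
    """Greedy assignment over levels R_k, R_{k-1}, ..., R_0, R_{-1}."""

    def __init__(self, levels):
        # levels: list of (level_index, neighbor_function), tried in the given order.
        self.levels = levels
        self.used = set()  # right vertices already taken, stored as (level, y)

    def assign(self, x):
        """Match x to the first unused neighbor, preferring earlier levels."""
        for level, neighbors_of in self.levels:
            for y in neighbors_of(x):
                if (level, y) not in self.used:
                    self.used.add((level, y))
                    return (level, y)
        return None  # not expected when every level expands as in the proof


def from_table(table):
    """Turn a dict {left vertex: [right vertices]} into a neighbor function."""
    return lambda x: table.get(x, [])


# Hypothetical toy levels (k = 1): level 1, level 0, and the emergency level -1.
levels = [
    (1, from_table({"a": [0], "b": [0], "c": [0, 1], "d": [0]})),
    (0, from_table({"a": [0], "b": [0], "c": [0], "d": [0]})),
    (-1, lambda x: [0]),  # singleton set adjacent to every left vertex
]
matcher = GreedyOnlineMatcher(levels)
print([matcher.assign(x) for x in ["a", "b", "c", "d"]])
```

Each call inspects every neighbor of the arriving vertex at most once, which is in line with the requirement in Definition 8 that a neighbor be selected in time linear in the degree.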

Let $C_U$ denote plain Kolmogorov complexity with respect to a machine $U$.
That is, $C_U(x) = \min\{|p| : U(p) = x\}$. Let $M_0, M_1, \ldots$ be an effective
enumeration of all Turing machines. Let $\langle \cdot, \cdot \rangle$ denote an
efficient encoding of pairs of strings whose output has length which is a function of
the first coordinate's length plus the second coordinate's length, and define a machine
$U$ such that $U(\langle e, x \rangle) = M_e(x)$. Now for any machine $M$, the standard
universal machine $U$ satisfies $C_U(x) \le C_M(x) + O(1)$, see [5] for more details.
For the remainder of this manuscript, let $C = C_U$ denote the Kolmogorov complexity of
this standard universal machine $U$.
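To illustrate the definition $C_U(x) = \min\{|p| : U(p) = x\}$, the sketch below replaces $U$ with a hypothetical total decoder `toy_machine`, for which the minimum can be found by exhaustive search. This is purely an assumption for the sake of a runnable example: the toy decoder is not universal, and for the actual universal machine no such search terminates in general, which is exactly why $C$ is not computable.

```python
from itertools import product

def toy_machine(p):
    """A hypothetical total decoder standing in for U (it is not universal)."""
    if p.startswith("0"):
        return p[1:]       # literal mode: output the rest of the program verbatim
    if p.startswith("1"):
        return p[1:] * 2   # doubling mode: output the rest of the program twice
    return None            # the empty program produces no output

def toy_complexity(x):
    """Return (min{|p| : toy_machine(p) = x}, a witness p) by exhaustive search."""
    for length in range(len(x) + 2):
        for bits in product("01", repeat=length):
            p = "".join(bits)
            if toy_machine(p) == x:
                return length, p
    raise AssertionError("unreachable: '0' + x always decodes to x")

print(toy_complexity("0101"))  # the doubling program "101" is shorter than "00101"
print(toy_complexity("011"))   # odd length, so only the literal program works
```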

Corollary 9. There exists a polynomial-time computable function which maps
each binary string $x$ to a $\mathrm{poly}(|x|)$ size list containing a length
$C(x) + O(1)$ description for $x$.

Proof. Define a (not necessarily polynomial-time) machine $M$ which does the
following. Dovetail on all computations $U(p)$, and as each one converges, apply the
Explicit Online Matching Theorem (EOMT) with $k = |p|$ to match the value
$U(p) = y$ with a right hand vertex $z$, which we interpret as a string of length
$|p| + 4$. Then set $M(z) = y$. Thus whenever $p$ is a $U$-shortest description for
$y$, there is a string $z$ of length $|p| + 4$ which is a neighbor of $y$ in the EOMT
graph and satisfies $M(z) = y$. Now $M = M_e$ for some index $e$ on the list of Turing
machines described above in the construction of $U$. Hence for all strings $y$ with
$C(y) \le |y|$, the efficiently computable set

$$\{\langle e, z \rangle : z \text{ is a neighbor of } y \text{ in some EOMT graph } k \le |y|\}$$

contains a $U$-description $\langle e, z \rangle$ for $y$ with $|z| \le C(y) + 4$. In
order to cover the case $C(y) > |y|$, we add an extra description to this set, namely
$\langle i, y \rangle$, where $M_i$ is the identity map on all strings. □
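The shape of the resulting list-building function can be sketched as below. Everything concrete here is an assumption for illustration: pairs $\langle e, z \rangle$ are represented as Python tuples rather than encoded strings, the indices `e` and `identity_index` are arbitrary placeholders, and `placeholder_neighbors` is a stand-in with no expansion guarantee; it is not the EOMT graph and only produces strings of the right length.

```python
def candidate_list(y, eomt_neighbors, e, identity_index):
    """Candidate descriptions for y, mirroring the set used in the proof."""
    candidates = []
    for k in range(len(y) + 1):             # every EOMT graph with k <= |y|
        for z in eomt_neighbors(k, y):      # each z is a (k + 4)-bit string
            candidates.append((e, z))
    candidates.append((identity_index, y))  # covers the case C(y) > |y|
    return candidates


def placeholder_neighbors(k, y):
    """Hypothetical stand-in: NOT an expander, only yields (k + 4)-bit strings."""
    width = k + 4
    seeds = [int(y, 2) if y else 0, len(y)]
    return [format(s % (1 << width), f"0{width}b") for s in seeds]


lst = candidate_list("1011", placeholder_neighbors, e=7, identity_index=3)
print(len(lst), lst[-1])
```

With a neighbor function of degree $\mathrm{poly}(|y|)$ for each of the $|y| + 1$ values of $k$, the list has polynomial size, matching the statement of Corollary 9.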

If we are not concerned with the running time of our list-building algorithm,
then the degree of each vertex of length $n$ in the Explicit Online Matching Theorem
can be reduced from $\mathrm{poly}(|x|)$ to $O(|x|^2)$ by applying Lemma 6 for not just
small $k$ but all values $k$. Winding through the argument in Corollary 9 with this
alternative Explicit Online Matching Theorem yields a computable function which maps
each binary string $x$ to an $O(|x|^2)$ size list containing a length $C(x) + O(1)$
description for $x$. This is as small as the list could possibly be [1].

Acknowledgment. The author is grateful to Marius Zimand for his observation,
Lemma 5, which improved the error term in Corollary 9 from $O[\log C(x)]$ to $O(1)$
and simplified the proof of Lemma 7.